ABSTRACT

This study compares exponential smoothing with alternative methods of
demand forecasting. Special attention is paid to the
stock-out  risks  assumed  whenever  reorder  levels  are  set  using  the  various 
methods  being  compared.   Models  presently  used  by  NavSup  are  employed 
in  order  that  the  results  be  applicable  to  the  system  in  use.   Simulation 
techniques  are  used  for  drawing  comparisons.   For  constant  mean,  normal 
demand,  it  is  shown  that  exponential  smoothing  does  not  produce  as 
accurate  results  as  ordinary  maximum  likelihood  techniques.   For  the 
case  of  a  linear  mean  changing  with  time,  it  is  shown  that  the  two  methods 
are  about  comparable.   Finally,  a  sequential  Bayes  forecasting  method  is 
defined  and  found  to  compare  quite  favorably  with  exponential  smoothing. 
The  need  for  additional  study  of  Bayesian  methods  is  established. 

This  task  was  supported  by  the  Research  and  Development  Division, 
Naval  Supply  Systems  Command. 
1.   INTRODUCTION.

In  two  previous  reports  ([1]  and  [2])  a  rather  detailed 
examination  of  some  of  the  aspects  of  exponential  smoothing  as  a 
demand forecasting tool was presented. In particular, special attention
was paid to the manner in which reorder levels are affected in a
variety  of  forms  using  models  presently  employed  by  NavSup  and 
originally  generated  by  R.  G.  Brown  [3]. 

In the case of a normal demand with constant mean and variance
(high mover, low value items) the results of setting reorder levels
using exponential smoothing were compared with those obtained using
classical maximum likelihood techniques. Because of the intractability
of the probability distributions involved using exponential smoothing,
simulation techniques had to be used for comparing these methods.
While  such  methods  fail  to  produce  absolutely  conclusive  results,  the 
overwhelming  evidence  favoring  maximum  likelihood  over  exponential 
smoothing  in  every  case  examined  can  hardly  be  taken  lightly.   The 
results really were not surprising. As previously pointed out, whenever
the Gauss-Markov assumptions apply, as they do in these models,
almost any departure from maximum likelihood methods is doomed to be
second best, at most. Yet, on the practical side, one can ask, "How
bad off is second best?" and, "Is there a trade-off perhaps between
optimality and some other desirable facets such as reduced computation
time or perhaps ease of understanding?" Again, attempts were made
to answer these questions by having NavSup personnel choose the criterion
and then draw comparisons on that criterion. For the models studied,
very little beyond the intuitive appeal of weighting previous demands
with the highest weight going to the most recent demand could be said
for exponential smoothing.

To be more specific, previous studies focused on the case where
demand in a period is normally distributed with mean $\mu$ and standard
deviation $\sigma$. From period to period, such demands are independent
but always with this same probability distribution. If $\mu$ and $\sigma$
were known, then it would be a relatively simple matter to set a
reorder level to apply period by period in order to achieve a specified
stockout risk. Indeed, if $X$ represents random demand in the period
to come and a stockout risk of $p$ is specified, then the reorder
level should be set at $\mu + k\sigma$ where the constant $k$ is determined
from the simple relationship,

(1.1)  $p = P(X > \mu + k\sigma)$

Since this can be immediately translated into

(1.2)  $p = P\left(\frac{X - \mu}{\sigma} > k\right)$

and $\frac{X - \mu}{\sigma}$ is the standard or tabled normal random variable, it is a
trivial task to match $k$ with $p$ by means of a normal table. For
example, if $p = .05$, then $k = 1.645$, while if $p = .10$, then
$k = 1.282$ and so on. Obviously, choosing larger and larger values
of $k$ guards against being out of stock, but only at the expense
perhaps of holding excessive stock on hand. The difference in the
consequences of these two standard undesirable conditions will have to
guide one's choice of $p$ and hence $k$.
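The table lookup described above can also be done numerically. The following is a minimal sketch, assuming Python's standard `statistics.NormalDist` in place of a printed normal table; the function names `safety_factor` and `reorder_level` are illustrative and do not come from the report.

```python
from statistics import NormalDist


def safety_factor(p: float) -> float:
    """Return k such that P(Z > k) = p for a standard normal Z, as in (1.2)."""
    return NormalDist().inv_cdf(1.0 - p)


def reorder_level(mu: float, sigma: float, p: float) -> float:
    """Theoretical reorder level mu + k*sigma of (1.1) when mu and sigma are known."""
    return mu + safety_factor(p) * sigma


# Reproduce the two table look-ups quoted in the text.
for p in (0.05, 0.10):
    print(p, round(safety_factor(p), 3))
```

With known parameters the reorder level follows directly, for example `reorder_level(100, 10, 0.05)`.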

The difficulty is that even if the model applies, the parameters
$\mu$ and $\sigma$ are rarely known. This means that they will have
to be estimated and when these estimates are used to set the reorder
level, there is no longer any guarantee that the specified value of
$p$ in (1.1) is satisfied. This is true regardless of how $\mu$ and $\sigma$
are estimated and is just one of those statistical facts of life.
The true risk that is faced thus depends upon the joint probability
distribution of the estimators involved and may or may not depart
significantly from the intended risk. Or if you prefer, the actual
costs of being out of stock will eventually be observed to depart
from what was supposed to be the case because of the fact that the
estimated reorder level is not the theoretical one specified by (1.1).
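To make the point concrete, the risk actually faced when a reorder level is set from estimates can be computed once the true parameters are assumed. The sketch below is illustrative only: the true values (mean 100, standard deviation 10) and the three candidate reorder levels are hypothetical numbers chosen for the example, and `attained_risk` is a name introduced here.

```python
from statistics import NormalDist


def attained_risk(level: float, mu: float, sigma: float) -> float:
    """P(X > level) when demand X is truly normal with mean mu and s.d. sigma."""
    return 1.0 - NormalDist(mu, sigma).cdf(level)


# Hypothetical reorder levels: one set too low, the theoretical one for
# p = .05, and one set too high.
for level in (112.0, 116.45, 121.0):
    print(level, round(attained_risk(level, mu=100.0, sigma=10.0), 4))
```

A level that is estimated too low carries a risk above the intended one, and a level estimated too high carries excess stock instead.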

This being the case, the precision with which $\mu$ and $\sigma$ are
estimated becomes an extremely important factor. And here is precisely
where exponential smoothing begins to lose contests, at least
in the normal models that have been examined. The numerical results
in all of those cases, coupled with some theoretical results to be
reported presently, indicate that exponential smoothing always seems
to be more variable than classical maximum likelihood. What is worse,
that variance does not improve with time. It is a function of the
smoothing constant and, in that regard, can only be reduced at the
expense of destroying the most compelling reason for employing it,
namely, by reducing the weight assigned to the most recent observation
to zero.

It  has  been  brought  to  the  writer's  attention  that  exponential 
smoothing  really  was  never  "invented"  for  the  constant  mean  model  in 
the  first  place.   Perhaps  so,  but  it  is,  nevertheless,  presently  used 
in  precisely  those  cases  and  hence  must  stand  on  its  own  merit  under 
scrutiny,  particularly  when  alternatives  are  available  that  appear  to 
do a better job for an equal amount of effort. Of even more significance,
however, is the fact that exponential smoothing was found to
be  second  best  even  in  one  case  where  the  mean  value  of  the  demand 
process  is  allowed  to  change  in  time.   These  results  are  reported  in 
Section  3. 

Before  turning  to  specific  results,  perhaps  a  remark  or  two 
regarding  random  demand  would  be  in  order.   Generally  speaking,  if 
demand  is  truly  random  and  the  values  of  these  random  variables  are 
used  to  set  reorder  levels,  or  in  general  estimate  parameters,  it  is 
inherently  part  of  the  model  that  the  resulting  values  will  fluctuate 
in  a  random  fashion  also.   There  is  no  way  around  this  point  and  usually 
the  best  we  can  hope  for  is  that  these  random  fluctuations  eventually 
dampen about some ideal or hoped-for value. First, we usually try to
establish that at least these resultant processes will converge to a
target value in the mean. Thus, it is desirable certainly to be able
to establish that random reorder levels will eventually converge in
expected value to $\mu + k\sigma$ whatever $\mu$ and $\sigma$ happen to be. But,
such convergence is not enough. Unless the variance of that process
goes to zero in time there is no assurance that the process is in any
sense close to the required value regardless of how long the system
may have been operating. It is this examination of variance properties
of exponential smoothing that is notably lacking in the published
literature. In this report, such considerations are included in a
detailed examination of several models currently in vogue.

2.   NOTATION  AND  SUMMARY  OF  PREVIOUS  RESULTS.

Perhaps  it  is  unfair  to  indict  exponential  smoothing  as  being 
the  fundamental  problem  in  the  models  tested.   In  a  previous  report 
[2], it was pointed out that it is a combination of exponential smoothing
with the use of mean absolute deviation (MAD) as a means of estimating
variability that appears to create the major difficulty. To
summarize  this  point  and  report  additional  results,  the  following 
notation  is  adopted. 

Let $X_0, X_1, X_2, \ldots, X_t$ be a demand record through time $t$. We
assume for this section that these are mutually independent normal
random variables each with mean $\mu$ and standard deviation $\sigma$. Following
Brown [3], we let $\tilde{X}_{t-1}$ denote the forecast at time $t - 1$ of
the demand in the $t$th period using exponential smoothing of the data
to compute its value.

(2.1)  $\tilde{X}_{t-1} = \alpha \sum_{k=0}^{t-2} \beta^k X_{t-1-k} + \beta^{t-1} X_0, \qquad 0 < \alpha < 1, \quad \beta = 1 - \alpha$


By using this basic formula, it can be shown [1] that $E[\tilde{X}_{t-1}] = \mu$
for all $t$ so that we may view (2.1) as an unbiased estimator of
mean demand $\mu$ from period to period. If we then define a forecast
error at time $t$ by means of the formula

(2.2)  $e_t = X_t - \tilde{X}_{t-1}$

then it follows that $E(e_t) \equiv 0$.

However, as previously remarked, the variance of any estimator
must also be examined. In a previous report, we established that

(2.3)  $\mathrm{Var}(\tilde{X}_{t-1}) = \frac{\alpha + 2\beta^{2t-1}}{2 - \alpha}\, \sigma^2$

Asymptotically then,

(2.4)  $\mathrm{Var}(\tilde{X}_{t-1}) \to \frac{\alpha}{2 - \alpha}\, \sigma^2$  as  $t \to \infty$

Now this is a positive constant and it must be recognized then that,
as an estimator for $\mu$, $\tilde{X}_{t-1}$ can never be more precise than this
limiting variance allows. In other words, no matter how long the
system has operated, the forecast will fluctuate about $\mu$ with a
variance whose size depends upon the unknown variance $\sigma^2$ as well as,
of course, the choice of the smoothing constant $\alpha$.
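Because the demands are independent with common variance, the variance of the forecast is the sum of the squared weights in (2.1) times the demand variance. The sketch below checks that sum against the closed form $(\alpha + 2\beta^{2t-1})/(2 - \alpha)$ and its limit $\alpha/(2 - \alpha)$. It assumes the weights are exactly those of (2.1), including the weight $\beta^{t-1}$ on the first demand; the function names are illustrative.

```python
def smoothing_weights(alpha: float, t: int) -> list:
    """Weights that (2.1) places on X_{t-1}, ..., X_1 and finally X_0."""
    beta = 1.0 - alpha
    weights = [alpha * beta**k for k in range(t - 1)]  # k = 0, ..., t-2
    weights.append(beta ** (t - 1))                    # weight on X_0
    return weights


def variance_ratio_direct(alpha: float, t: int) -> float:
    """Var(forecast) / sigma^2 as the sum of squared weights."""
    return sum(w * w for w in smoothing_weights(alpha, t))


def variance_ratio_closed(alpha: float, t: int) -> float:
    """Closed form (alpha + 2*beta^(2t-1)) / (2 - alpha)."""
    beta = 1.0 - alpha
    return (alpha + 2.0 * beta ** (2 * t - 1)) / (2.0 - alpha)


for alpha in (0.1, 0.3):
    print(alpha,
          variance_ratio_direct(alpha, 200),
          variance_ratio_closed(alpha, 200),
          alpha / (2.0 - alpha))
```

The weights sum to one for every $t$, which is the unbiasedness noted above, while the sum of their squares settles at a positive constant rather than at zero.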


The same remarks can also be made about the forecast error $e_t$.
Although its expected value vanishes for all $t$, it too has a limiting
variance bounded away from zero and given by the formula

(2.5)  $\sigma_e^2 = \lim_{t \to \infty} \mathrm{Var}(e_t) = \lim_{t \to \infty} \frac{2 + 2\beta^{2t-1}}{2 - \alpha}\, \sigma^2 = \frac{2}{2 - \alpha}\, \sigma^2$

This result also allows us to write $\sigma$, an unknown parameter, in terms
of the limiting standard deviation $\sigma_e$ as,

(2.6)  $\sigma = \sqrt{\frac{2 - \alpha}{2}}\; \sigma_e$

The main reason for noting this relationship is to comply with the
NavSup procedure for estimating $\sigma$ by means of estimates of $\sigma_e$.
These are in turn found by smoothed estimates of MAD. In the normal
case, which is the only one we are treating, $\sigma_e$ is related to MAD,
$\Delta_e$, by means of the formula,

(2.7)  $\sigma_e = \sqrt{\frac{\pi}{2}}\; \Delta_e$

Combining this with (2.6) yields

(2.8)  $\sigma = \frac{\sqrt{\pi (2 - \alpha)}}{2}\; \Delta_e$

Exponentially smoothed estimates of $\Delta_e$ are obtained by
smoothing forecast errors. By formula,

(2.9)  $\tilde{\Delta}_e = \alpha \sum_{k=0}^{t-1} \beta^k\, |e_{t-k}|$

If this result is substituted ad hoc into (2.8) one then obtains the
estimate

(2.10)  $\tilde{\sigma} = \frac{\sqrt{\pi (2 - \alpha)}}{2}\; \tilde{\Delta}_e$

consistent with formulas established by Brown. We are then but a
step away from the formula for setting a reorder level using smoothed
estimates. First, the constant mean is estimated. After $t$ periods
of demand have been observed, mean demand is estimated by means of
the formula

(2.11)  $\tilde{\mu} = \alpha \sum_{k=0}^{t-1} \beta^k X_{t-k}$

The formula ignores initial conditions, which are rendered ineffectual
in time anyway. Since the claims for smoothing properties are asymptotic
in the first place, this represents no serious modification and
yields at least an asymptotic unbiasedness wherein $E(\tilde{\mu}) \to \mu$. When
this estimate is combined with (2.10), a smoothed estimate of the
reorder level becomes

(2.12)  $\tilde{R} = \tilde{\mu} + k \tilde{\sigma}$

where $k$ is chosen to satisfy a required stock-out risk $p$ as determined
by (1.1).
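In practice the weighted sum for the smoothed mean is rarely evaluated term by term; it is updated one period at a time. The sketch below shows, under the assumption that the estimate starts from zero (which is what ignoring initial conditions amounts to), that the familiar recursive update reproduces the sum (2.11). The function names and the short demand record are illustrative.

```python
def smoothed_mean_sum(x, alpha):
    """Weighted sum (2.11); x holds the demands X_1, ..., X_t in time order."""
    beta = 1.0 - alpha
    t = len(x)
    # x[t - 1 - k] is X_{t-k}, the demand observed k periods before the latest.
    return alpha * sum(beta**k * x[t - 1 - k] for k in range(t))


def smoothed_mean_recursive(x, alpha):
    """Same estimate via new = alpha * demand + (1 - alpha) * old, starting at 0."""
    estimate = 0.0
    for demand in x:
        estimate = alpha * demand + (1.0 - alpha) * estimate
    return estimate


demands = [103.0, 96.0, 110.0, 99.0, 101.0]  # illustrative record
print(smoothed_mean_sum(demands, 0.2), smoothed_mean_recursive(demands, 0.2))
```

For a constant record equal to $\mu$ the sum evaluates to $\mu (1 - \beta^t)$, which shows directly why the unbiasedness is only asymptotic.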

As previously noted, however, the true risk that is achieved
by using (2.12), or indeed any formula involving only estimates of
$\mu$ and $\sigma$, will depend on how well those parameters are estimated.
Of special interest is the comparison of smoothed estimates with
maximum likelihood methods wherein,

(2.13)  $\hat{R} = \hat{\mu} + k \hat{\sigma}$

with  $\hat{\mu} = \frac{1}{t} \sum_{i=1}^{t} X_i$  and  $\hat{\sigma} = \sqrt{\frac{1}{t} \sum_{i=1}^{t} (X_i - \hat{\mu})^2}$

being the ordinary maximum likelihood estimates of $\mu$ and $\sigma$. This
point was the subject of some of the discussion in [2]. It was pointed
out there in several ways that (2.13) was superior to (2.12) in case
after case. Subsequent examinations by Ornek [4] and Coventry [5]
reveal the same consistent behavior.

While  all  of  these  results  continue  to  be  based  on  simulations, 
the  consistency  with  which  exponential  smoothing  tends  to  produce  more 
variable  results  than  maximum  likelihood  cannot  be  ignored.   Moreover, 
there is now some theoretical basis for this claim. Ornek has been
able to establish an exact formula for the asymptotic variance of $\tilde{\Delta}_e$
which is of course a fundamental quantity used in the computation of
a reorder level. The expression is complicated and is not duplicated
here; details may be found in [4]. For all practical purposes approximate
values with a high degree (within 10   ) of accuracy were computed
for various choices of the smoothing constant $\alpha$. A summary appears
in Table 2.1.
$\alpha$        $\mathrm{Var}(\tilde{\Delta}_e)/\sigma^2$        S.D. of $\tilde{\sigma}/\sigma$

.10             .0204                                            .1745
.15             .0325                                            .2173
.20             .0461                                            .2553
.25             .0614                                            .2905
.30             .0785                                            .3239
.35             .0979                                            .3561
.40             .1196                                            .3876
.45             .1440                                            .4187
.50             .1716                                            .4495
.55             .2026                                            .4802
.80             .4267                                            .6341

Table 2.1.   Variability of $\tilde{\Delta}_e$ and $\tilde{\sigma}$.
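The two numerical columns of Table 2.1 are tied together by the conversion (2.10): the standard deviation of the smoothed estimate of the demand standard deviation, relative to its true value, is the constant $\sqrt{\pi(2-\alpha)}/2$ times the square root of the tabled variance. The sketch below recomputes the last column from the middle one as a consistency check. It assumes the middle column is the variance of the smoothed MAD divided by the demand variance; `sd_ratio` is a name introduced here.

```python
from math import pi, sqrt

# alpha -> (tabled variance ratio of smoothed MAD, tabled S.D. ratio), from Table 2.1
TABLE_2_1 = {
    0.10: (0.0204, 0.1745), 0.15: (0.0325, 0.2173), 0.20: (0.0461, 0.2553),
    0.25: (0.0614, 0.2905), 0.30: (0.0785, 0.3239), 0.35: (0.0979, 0.3561),
    0.40: (0.1196, 0.3876), 0.45: (0.1440, 0.4187), 0.50: (0.1716, 0.4495),
    0.55: (0.2026, 0.4802), 0.80: (0.4267, 0.6341),
}


def sd_ratio(alpha: float, var_ratio: float) -> float:
    """S.D. of the smoothed sigma estimate over sigma, implied by (2.10)."""
    return sqrt(pi * (2.0 - alpha)) / 2.0 * sqrt(var_ratio)


for alpha, (var_ratio, tabled_sd) in TABLE_2_1.items():
    print(alpha, round(sd_ratio(alpha, var_ratio), 4), tabled_sd)
```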

The table amply demonstrates how asymptotic variability increases
with the choice of $\alpha$ but more importantly perhaps, no matter how long
the system runs, the variance of $\tilde{\sigma}$ never approaches zero and is
bounded away by a positive quantity. This means that smoothed estimates of $\sigma$,
and hence of $R$, the theoretical reorder level, are doomed to fluctuate
forever. Not so for maximum likelihood. It is well known that
the variance of $\hat{\sigma}$ goes to zero with increasing $t$ (as does the
variance of $\hat{\mu}$ of course) so that eventually, $\hat{R}$ and $R$ coincide
for all practical purposes. Put another way, the intended risk $p$
and the actual risk attained will be the same, whereas the same statement
simply cannot be made about $\tilde{R}$.




All of this merely supports what was already observed in
simulation results. Extending those results already established in
the pilot study of [2], simulations were run for various parameter
pairs and the risk levels compared at the 1,000th observation. For
each of several such parameter pairs, five risk levels were chosen.
Then actual risks $\tilde{p}$ for smoothed estimates were compared with actual
risks $\hat{p}$ using maximum likelihood techniques. These results are
reported in Table 2.2 and they pretty well speak for themselves. The
attained risks, as measured by $\hat{p}$, are consistently nearer the target
value $p$ than are those determined by $\tilde{p}$. What this means is that
even after the system has operated for a long, long time, with initial
conditions and other factors stabilized, the actual risk attained
when reorder levels are set using (2.12) may in any period be significantly
different from the value that presumably was being attained
by the choice of $k$.

Another way to view the greater variability involved when
smoothing is used to set reorder levels over a long period of time
was devised by Coventry. For this experiment parameter values of
$\mu = 100$ and $\sigma = 10$ were chosen. Using a risk level of .05, the
theoretical reorder level would be 116.45. Demands were generated
for 1,000 periods and the reorder level using $\tilde{R}$ and $\hat{R}$ was checked
at the 1,000th period. This experiment was then replicated 100 times
and the various values of $\tilde{R}$ and $\hat{R}$ were checked and plotted against
the theoretical reorder level. The results are displayed in Figure 2.1
and once again the strikingly larger variability in $\tilde{R}$ may be noted.
One way to view these results is as follows. Think of 100 supply
centers all operating under the same reorder rules for a given item.
After 1,000 periods (far in excess of the number of periods for which
records are typically kept) the graph may be viewed as showing the
actual reorder levels that would be set at the various centers, first,
all using $\tilde{R}$ and, secondly, all using $\hat{R}$. Again the results speak
for themselves.
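An experiment of this kind is easy to repeat. The following is a rough sketch, not a reconstruction of Coventry's program: the smoothing constant (0.1), the start-up conditions, the random seed, and all function names are assumptions made for the example, since none of them is stated in the report. Mean 100, standard deviation 10, risk .05, 1,000 periods and 100 replications are taken from the text.

```python
import random
from math import pi, sqrt
from statistics import NormalDist, fmean, pstdev


def smoothed_reorder_level(demands, alpha, k):
    """Smoothed level of (2.12), using the recursive form of (2.9)-(2.11)."""
    forecast = demands[0]  # assumed start-up: first demand seeds the forecast
    mad = 0.0              # assumed start-up: smoothed absolute error begins at 0
    for x in demands[1:]:
        err = x - forecast                            # forecast error, as in (2.2)
        mad = alpha * abs(err) + (1 - alpha) * mad
        forecast = alpha * x + (1 - alpha) * forecast
    sigma = sqrt(pi * (2 - alpha)) / 2 * mad          # conversion (2.10)
    return forecast + k * sigma


def ml_reorder_level(demands, k):
    """Maximum likelihood level of (2.13): mean plus k times divisor-t s.d."""
    return fmean(demands) + k * pstdev(demands)


def replicate(n_reps=100, n_periods=1000, mu=100.0, sigma=10.0,
              alpha=0.1, p=0.05, seed=1):
    """Return the reorder levels found at the last period of each replication."""
    rng = random.Random(seed)
    k = NormalDist().inv_cdf(1 - p)
    smoothed, ml = [], []
    for _ in range(n_reps):
        demands = [rng.gauss(mu, sigma) for _ in range(n_periods)]
        smoothed.append(smoothed_reorder_level(demands, alpha, k))
        ml.append(ml_reorder_level(demands, k))
    return smoothed, ml


smoothed, ml = replicate()
print("spread of smoothed levels:", round(pstdev(smoothed), 2))
print("spread of ML levels:      ", round(pstdev(ml), 2))
```

The two printed spreads play the role of the scatter in the figure: one number per method summarizing how far the 100 centers would sit from one another after 1,000 periods.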

3.   LINEAR  MEAN  MODEL. 

As  previously  remarked,  it  may  be  unfair  to  indict  exponential 
smoothing  on  the  basis  of  a  constant  mean  model  since  it  appears  to 
be  designed  more  for  models  which  are  more  time-dependent.   Indeed, 
at  the  very  heart  of  smoothing  techniques  is  the  idea  that  the  most 
recent  demands  are  more  indicative  of  the  true  demand  pattern  than 
are  the  earlier  ones.   For  a  constant  mean  demand  of  course,  that  is 
not  true  and  all  demands  reflect  the  true  pattern  equally  well.   But 
even  when  the  mean  is  changing  in  time,  this  idea  of  weighting  the 
most recent demand heavily must not be carried too far. For deterministic
demands there can be little argument, but when demands are truly
random,  sudden  increases  or  decreases  in  demand  are  to  be  expected 
even  with  a  stable  mean  and  there  is  a  question  of  just  how  much 
weight  should  be  assigned  these  random  fluctuations.   In  any  event 
the  system  can  be  studied  to  see  what  such  effects  are. 
Perhaps the simplest time-dependent model that can be investigated
is the case where demand is random with linear mean but a constant
variance. This is the familiar linear regression model and with
normality further assumed leads to standard maximum likelihood estimates
of the parameters involved and once again presents itself as an
alternative to exponential smoothing. To be more specific, suppose
demand in period $t$ is given by

(3.1)  $X_t = a + bt + \varepsilon_t$  where  $\varepsilon_t$  is  $N(0, \sigma^2)$

Again Brown [3] recommends forecasting demands by means of exponentially
smoothed estimates. This time, since two parameters are involved, a
combination of single and double smoothing is required. More specifically,
Brown advocates estimates

(3.2)  $\tilde{x}_t = 2 S_t(x) - S_t^2(x)$

       $\tilde{b} = \frac{\alpha}{\beta} \left[ S_t(x) - S_t^2(x) \right]$

Since $\mu_{t+1} = a + b(t+1) = \mu_t + b$, it follows that $\tilde{x}_t + \tilde{b}$ is a
reasonable way of estimating $\mu_{t+1}$. In these formulas, $S_t(x)$ stands
for single smoothing applied to the demand record $X_0, X_1, X_2, \ldots, X_t$,

       $S_t(x) = \alpha \sum_{k=0}^{t} \beta^k X_{t-k}, \qquad \beta = 1 - \alpha$

$S_t^2(x)$ on the other hand represents smoothing applied to the sequence
$S_0(x), S_1(x), S_2(x), \ldots, S_t(x)$ so that

       $S_t^2(x) = \alpha \sum_{k=0}^{t} \beta^k S_{t-k}(x)$


It should be noted that $\tilde{x}_t$ is not an estimate of $a$ but
rather of $a + bt$. With $\tilde{b}$ given however, one can estimate $a$ by
the formula

(3.3)  $\tilde{a} = \tilde{x}_t - \tilde{b} t$

The reason for this observation is that usually in regression models
of this type, estimates of the separate parameters are given. Indeed,
in this notation, the maximum likelihood estimators of $a$ and $b$ are
given by,

(3.4)  $\hat{b} = \frac{\sum_{k=0}^{t} (k - \bar{k})(X_k - \bar{X})}{\sum_{k=0}^{t} (k - \bar{k})^2}, \qquad \bar{k} = \frac{1}{t+1} \sum_{i=0}^{t} i$

       $\hat{a} = \bar{X} - \hat{b} \bar{k}$

with $\bar{X}$ the corresponding average of the demands $X_0, \ldots, X_t$.
These are standard formulas that may be found in almost any standard
textbook on the subject. In these terms, an estimate of $\mu_{t+1} = a + b(t+1)$
would be given by $\hat{a} + \hat{b}(t+1) = \hat{a} + \hat{b} t + \hat{b}$.

This leaves the unknown parameter $\sigma$ to estimate. In the
theory of maximum likelihood, this estimate is easily derived and is
given by considering average squared deviations about the fitted
regression line. We thus have,

(3.5)  $\hat{\sigma} = \sqrt{\frac{1}{t+1} \sum_{k=0}^{t} (X_k - \hat{a} - \hat{b} k)^2}$


Not surprisingly (in terms of Section 2) the parameter $\sigma$ is estimated
in exponential smoothing by looking at weighted absolute deviations
about the fitted line. Thus we first let

       $e_t = X_t - \tilde{x}_{t-1} - \tilde{b}$

be the difference between what was observed and what was forecast and
then define

       $\tilde{\Delta}_t = \alpha \sum_{k=0}^{t-1} \beta^k\, |e_{t-k}|$

With normality assumed we may then use

       $\tilde{\sigma} = \frac{\sqrt{\pi (2 - \alpha)}}{2}\; \tilde{\Delta}_t$

as before to estimate $\sigma$.

Once we have estimates of the various parameters of course we
may use these to set reorder levels once again. In the spirit of the
preceding section, two methods will be compared again. First, maximum
likelihood estimates are used in each period to define

(3.6)  $\hat{R} = \hat{a} + \hat{b}(t+1) + k \hat{\sigma}$

This would be the reorder level set at time $t$ based on the fact that
the "best" estimate of the next demand would be $\hat{a} + \hat{b}(t+1)$, the
estimate of the mean $\mu_{t+1}$. If exponential smoothing is employed,
then the reorder level would be set at

(3.7)  $\tilde{R} = \tilde{x}_t + \tilde{b} + k \tilde{\sigma}$

based on the same kind of reasoning.
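The two reorder rules for the linear mean model can be sketched side by side. This is an illustration under stated assumptions rather than the program used in the study: the smoothing statistics are updated recursively, both are seeded with the first demand, and the smoothed absolute error starts at zero. None of these start-up choices is given in the report, and the function names and the parameter name `safety` (for the constant $k$) are introduced here.

```python
from math import pi, sqrt


def ml_linear_reorder_level(x, safety):
    """Least-squares line through (0, x[0]), ..., (t, x[t]), then (3.6)."""
    n = len(x)
    t = n - 1
    k_bar = t / 2.0
    x_bar = sum(x) / n
    sxx = sum((i - k_bar) ** 2 for i in range(n))
    sxy = sum((i - k_bar) * (xi - x_bar) for i, xi in enumerate(x))
    b = sxy / sxx
    a = x_bar - b * k_bar
    # Divisor n = t + 1: the maximum likelihood form of (3.5).
    sigma = sqrt(sum((xi - a - b * i) ** 2 for i, xi in enumerate(x)) / n)
    return a + b * (t + 1) + safety * sigma


def smoothed_linear_reorder_level(x, alpha, safety):
    """Single and double smoothing as in (3.2), then (3.7)."""
    beta = 1.0 - alpha
    s1 = s2 = x[0]  # assumed start-up values for both smoothing statistics
    mad = 0.0       # assumed start-up value for the smoothed absolute error
    for xt in x[1:]:
        level = 2.0 * s1 - s2
        slope = alpha / beta * (s1 - s2)
        err = xt - level - slope          # observed minus one-step forecast
        mad = alpha * abs(err) + beta * mad
        s1 = alpha * xt + beta * s1       # single smoothing of the demands
        s2 = alpha * s1 + beta * s2       # smoothing of the smoothed series
    level = 2.0 * s1 - s2
    slope = alpha / beta * (s1 - s2)
    sigma = sqrt(pi * (2.0 - alpha)) / 2.0 * mad
    return level + slope + safety * sigma


# Noise-free demand a + b*t with a = 50, b = 2: both rules should forecast
# the next mean, 50 + 2*(t + 1), with essentially no safety stock.
line = [50.0 + 2.0 * t for t in range(201)]
print(ml_linear_reorder_level(line, 1.645),
      smoothed_linear_reorder_level(line, 0.2, 1.645))
```

On noise-free data the two rules agree once the start-up transient has died out; the differences reported in the study arise only when random fluctuation is present.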

How  do  these  two  methods  compare?   Again,  we  were  forced  to 
resort  to  simulation  for  reasons  that  are  even  more  pronounced  in 
this  case.   Generally  speaking,  and  not  too  surprising  perhaps,  the 
two  methods  compared  quite  favorably  with  each  other  when  attained 
risks  were  examined.   The  variability  in  the  smoothing  technique  was 
not nearly so noticeable as it was in the constant mean case. Nevertheless,
it was still present and never was reduced to an extent
where it could be labeled superior to maximum likelihood in any of
the cases examined.

First of all, many different cases (choices of $a$, $b$ and $\sigma$)
were examined by Coventry. For each parameter choice, estimates of
the parameters were calculated by both methods after 100 periods of
demand generated to satisfy the model of (3.1). The experiment was
then replicated 100 times and results were then averaged over these
cases. It was noted that the results appeared to be independent of
parameter choices and so attention was focused on just a few special
cases.

Typical of the results are those shown in Table 3.1 for the
choice $a = 50$, $b = 2$ and $\sigma = 5$. The attained risks are displayed
for various periods and for this case averaged over 1,000 replications
of the experiment. While the attained risks, $\tilde{p}$ using smoothing and
$\hat{p}$ using maximum likelihood, are both reasonably close to the theoretical
risk $p$, it should be noted once again that $\tilde{p}$ does tend to be
more variable with no consistent pattern of change. In nearly every
case $\tilde{p}$ does exceed $p$ however and that in itself is noteworthy.

                              Number of Periods

        10                          20                          50                          100
$p$     $\tilde{p}$   $\hat{p}$     $\tilde{p}$   $\hat{p}$     $\tilde{p}$   $\hat{p}$     $\tilde{p}$   $\hat{p}$

0.01    .036          .007          .027          .014          .024          .011          .029          .008
0.05    .103          .045          .070          .045          .078          .049          .068          .053
0.10    .155          .092          .124          .098          .137          .103          .114          .110
0.25    .306          .233          .264          .256          .274          .249          .249          .256
0.50    .516          .482          .519          .521          .526          .510          .517          .505

Table 3.1.   Attained Risks Compared to Theoretical Risk


With  these  results  and  the  many  other  cases  examined,  it  is  now 
reasonably  safe  to  conclude  that  exponential  smoothing  is  not  a  superior 
estimating  technique  for  normal  demands  whether  constant  or  linear  in 
time.   At  least  this  is  so  when  stock-out  risk  is  the  major  criterion 
(as  it  often  is)  and  when  the  methods  presently  employed  by  NavSup  as 
advocated  by  Brown  for  setting  reorder  levels  are  compared  to  classical 
techniques.   Indeed,  depending  on  the  consequences  of  facing  an  attained 
risk  that  is  not  the  intended  one,  this  method  may  be  inferior  to 
ordinary maximum likelihood techniques. This just about leaves computing
ease as the only criterion offered by smoothing advocates of any merit.
But  we  found  no  evidence  in  any  of  our  tests  that  smoothing  resulted 
in  any  significant  savings  in  computer  time  either.   In  most  cases, 
the  difference,  if  measurable,  was  negligible. 

4.   A  BAYES  PROCEDURE. 

In  actual  practice  it  was  found  that  neither  the  constant 
mean  model  nor  that  of  the  linear  mean  adequately  reflects  the  true 
nature  of  demand  even  when  the  assumption  of  normality  is  acceptable. 
The  model  that  comes  closest  to  reflecting  what  most  people  involved 
really  believe  in  (at  least  for  some  items)  is  that  demand  is  normal 
with  constant  mean  for  a  time,  perhaps  several  periods,  and  then 
shifts  to  a  new  mean  level  which  again  remains  constant  for  a  time. 
For  example,  in  times  of  conflict  there  may  be  a  sudden  increase  in 
demand  for  an  item  and  that  demand  has  a  mean  value  that  remains 
fairly  constant  for  the  duration.   But,  as  hostilities  cease,  the 
mean demand drops to a lower level and remains there while the circumstances
remain stable. Then neither of the preceding models applies
exactly although, subject to the general remarks previously made,
exponential smoothing should be a good candidate for such a model.
The  reason  is  the  often  quoted  property  of  responding  to  changes  in 
demand  more  quickly  than  classical  methods. 

There is yet another technique which would seem quite appropriate
for a model of this type and that is to apply Bayesian methods
sequentially to predict or forecast demand. The basic idea is to use
posterior  information  in  each  period  as  prior  information  for  the  next 
period.   Starting  with  some  initial  subjective  judgment  as  to  the 
parameters  involved  in  the  model,  one  can  then  proceed  to  use  the 
information  in  each  period  to  update  one's  guess  as  to  the  parameters 
to  come  and  forecast  accordingly.   After  all,  if  parameters  such  as 
mean  demand  are  truly  changing,  possibly  from  period  to  period,  then 
this  basic  Bayesian  approach  is  tailored  to  fit  precisely  that  kind 
of  situation. 

To be more specific, let us suppose that demand is still normally
distributed but the mean is changing possibly from period to period.
Initially, we also assume that the variance $\sigma^2$ in the initial period
is known. In each period we will set a reorder level at a value
$\mu^* + k\sigma^*$ where $\mu^*$ and $\sigma^*$ are estimates of $\mu$ and $\sigma$ for that
period with $k$ selected again in order to achieve some nominal risk
$p$. To put these assumptions into the Bayesian framework, we initially
assume that the conditional distribution of demand $X$ given a value
of the mean $\mu$ is normal with that mean and a known variance of $\sigma^2$.
As to the mean $\mu$, we suppose that the prior distribution on $\mu$ is
normal with some mean $\mu_0$ and variance $\sigma_0^2$. With this kind of normal
on normal assumption, it is easy to show (see [6] for example) that
the posterior distribution for $\mu$, given an observed demand $x_1$, is
again normal with mean $\mu_1$ and variance $\sigma_1^2$ given by,

(4.1)  $\mu_1 = \frac{\sigma_0^2 x_1 + \sigma^2 \mu_0}{\sigma_0^2 + \sigma^2}, \qquad \sigma_1^2 = \frac{\sigma_0^2 \sigma^2}{\sigma_0^2 + \sigma^2}$


As a bonus, if we take loss to be squared error, then the mean
$\mu_1$ of this posterior distribution is the Bayes estimator, meaning
that it minimizes the Bayes risk for the problem. (See [6] again for
details.) As such, $\mu_1$ and $\sigma_1$ are the best estimates (best from
a Bayesian point of view) of the parameters that exist in nature at
that point, namely, after one observation. Consequently, a logical
Bayesian reorder level would be set at $\mu_1 + k\sigma_1$ and the corresponding
stock-out risk $p_1 = P(X > \mu_1 + k\sigma_1)$ may or may not be the target
value $p$ depending on whether or not the mean and standard deviation
of demand in the second period are $\mu_1$ and $\sigma_1$ respectively.

Before proceeding to the next period it might pay to pause and
analyze the significance of the estimates in (4.1). Re-writing $\mu_1$
as

       $\mu_1 = \left( \frac{\sigma_0^2}{\sigma_0^2 + \sigma^2} \right) x_1 + \left( \frac{\sigma^2}{\sigma_0^2 + \sigma^2} \right) \mu_0$


we see that the updated estimate of the mean based on the first observed
demand $x_1$ is just a weighted average of $x_1$ and $\mu_0$, the initial
estimate of the mean. In this way, the weight attached to the observation
$x_1$ is like a smoothing constant and may be used to reflect one's
desires or beliefs in the initial states. By choosing $\sigma_0^2$ small,
very little (relatively) weight is attached to $x_1$ compared to $\mu_0$.
This is as it should be, for if $\sigma_0^2$ is small, then the prior distribution
is concentrated heavily about its mean $\mu_0$ and reflects a high
degree of credence in that initial choice $\mu_0$. On the other hand, if
one's initial belief in $\mu_0$ is somewhat weak, this can be reflected
by making $\sigma_0^2$ relatively large, whence more relative weight is
attached to what is actually observed in $x_1$.
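
The effect of the prior variance on this weighting can be checked
numerically. The sketch below is an illustration only, assuming a known
demand variance $\sigma^2$; the function name `normal_update` and the sample
values (an observation of 100 against an initial estimate of 0) are
hypothetical and are not taken from the study.

```python
def normal_update(prior_mean, prior_var, x, demand_var):
    """One-period normal-on-normal update, following (4.1).

    Returns the posterior mean, the posterior variance, and the weight
    that the posterior mean places on the observation x.
    """
    weight = prior_var / (prior_var + demand_var)
    post_mean = weight * x + (1.0 - weight) * prior_mean
    post_var = prior_var * demand_var / (prior_var + demand_var)
    return post_mean, post_var, weight


sigma2 = 100.0  # demand variance (sigma = 10), assumed known
for prior_var in (0.2 * sigma2, sigma2, 5.0 * sigma2):
    mean, var, w = normal_update(prior_mean=0.0, prior_var=prior_var,
                                 x=100.0, demand_var=sigma2)
    print(f"prior var={prior_var:6.1f}  weight on x={w:.3f}  "
          f"posterior mean={mean:6.2f}  posterior var={var:6.2f}")
```

With $\sigma_0^2 = \sigma^2$ the observation and the initial estimate receive
equal weight, so the posterior mean in this example is 50; a larger
$\sigma_0^2$ moves it toward the observation.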

Fortunately, this same basic scheme continues from period to
period as follows. By taking the prior for the mean in period 2 to
be the posterior from period one, the posterior distribution for
period 2, based on observing $x_2$, the actual demand during that period,
is again normal with mean $\mu_2$ and variance $\sigma_2^2$ given by the formulas

$$\mu_2 = \frac{\sigma_0^2 (x_1 + x_2) + \sigma^2 \mu_0}{2\sigma_0^2 + \sigma^2}, \qquad \sigma_2^2 = \frac{\sigma_0^2 \sigma^2}{2\sigma_0^2 + \sigma^2}.$$

Proceeding by induction in this manner, it is easy to show that the
posterior distribution at the end of period $t$, based on having
observed $x_1, x_2, \ldots, x_t$, is once again normal with mean $\mu_t$ and
variance $\sigma_t^2$ given by

$$\mu_t = \frac{\sigma_0^2 \sum_{k=1}^{t} x_k + \sigma^2 \mu_0}{t\sigma_0^2 + \sigma^2}, \qquad \sigma_t^2 = \frac{\sigma_0^2 \sigma^2}{t\sigma_0^2 + \sigma^2}. \tag{4.2}$$

As previously remarked, the reorder level for this period is then set
at $\mu_t + k\sigma_t$.

Once again it may be seen that the Bayes estimate of mean
demand, as given by the mean of the posterior distribution, is a
weighted sum. This time the total observed demand $\sum_{k=1}^{t} x_k$ is weighted
against the initial estimate $\mu_0$. It is significant to note, moreover,
that $\lim_{t \to \infty} \sigma_t^2 = 0$ so that, as time goes on, this posterior distribution
is becoming degenerate at $\mu_t$. Consequently, as a prediction of mean
demand, the chosen value is subject to less and less fluctuation as
time goes on.
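
The induction step and the shrinking posterior variance can be verified
numerically. The following is a minimal sketch, not the program used in
the study: it applies the one-period update repeatedly and compares the
result with the closed form (4.2). The function names, the seed, and the
choice $\mu_0 = 0$, $\sigma_0^2 = \sigma^2 = 100$ are illustrative assumptions.

```python
import random


def sequential_posterior(demands, prior_mean, prior_var, demand_var):
    """Update period by period; each posterior becomes the next prior."""
    mean, var = prior_mean, prior_var
    for x in demands:
        # One-period update (4.1), using the current prior (mean, var).
        mean = (var * x + demand_var * mean) / (var + demand_var)
        var = var * demand_var / (var + demand_var)
    return mean, var


def closed_form_posterior(demands, prior_mean, prior_var, demand_var):
    """Posterior mean and variance from the closed form (4.2)."""
    t = len(demands)
    denom = t * prior_var + demand_var
    mean = (prior_var * sum(demands) + demand_var * prior_mean) / denom
    return mean, prior_var * demand_var / denom


rng = random.Random(0)  # arbitrary seed, for repeatability only
demands = [rng.gauss(100.0, 10.0) for _ in range(100)]
for t in (1, 5, 25, 100):
    m_seq, v_seq = sequential_posterior(demands[:t], 0.0, 100.0, 100.0)
    m_cf, v_cf = closed_form_posterior(demands[:t], 0.0, 100.0, 100.0)
    print(f"t={t:3d}  sequential=({m_seq:7.3f}, {v_seq:7.3f})  "
          f"closed form=({m_cf:7.3f}, {v_cf:7.3f})")
```

The two computations agree up to rounding, and the printed variance
decreases toward zero as $t$ grows, as the limit above indicates.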

So much for theory. To determine just how much the estimate
of mean demand is affected by various combinations of $\mu_0$ and $\sigma_0^2$
and to see how it compares with exponential smoothing, a pilot study
using simulation was conducted. No attempt was made at this point to
examine the behavior for the case of a shifting mean. Rather this
study was confined to testing the procedure for internal consistency.
For the case of a constant mean value of $\mu = 100$ and a choice of
$\sigma = 10$, random demand was generated for 100 periods. The Bayes
estimate was then computed for various a priori combinations of
$\mu_0 = 0, \mu, \mu/2, \mu/3, \mu/4, \mu/5$ and
$\sigma_0^2 = \sigma^2, \sigma^2/2, \sigma^2/3, \sigma^2/4, \sigma^2/5, 2\sigma^2, 3\sigma^2, 4\sigma^2, 5\sigma^2$.

The results are displayed in Table 4.1, the entries being the
Bayes estimates or posterior means after 5 and 100 periods of observation.
Obviously, the closer that $\mu_0$ is to the true value of $\mu$
(100 in this case) the better the resulting estimate is. For large
values of the ratio $\sigma_0^2/\sigma^2$, it should be noted that the convergence
to 100 is fairly rapid even for poor initial guesses. For example,
with $\mu_0 = 0$ but $\sigma_0^2 = 5\sigma^2$, $\mu_5 = 96.6$ even after only 5 periods of
observation.
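
A pilot study of this kind might be laid out as follows. This is a
sketch under stated assumptions rather than a reconstruction of the
original program: demands are drawn with Python's own generator and an
arbitrary seed, so the printed values will not coincide with the report's
entries, and the names `posterior_mean`, `prior_means` and `ratios` are
hypothetical.

```python
import random

MU, SIGMA, PERIODS = 100.0, 10.0, 100
VAR = SIGMA ** 2


def posterior_mean(demands, mu0, var0, var=VAR):
    """Posterior mean (4.2) after len(demands) periods."""
    t = len(demands)
    return (var0 * sum(demands) + var * mu0) / (t * var0 + var)


rng = random.Random(1)  # arbitrary seed; not the report's random numbers
demands = [rng.gauss(MU, SIGMA) for _ in range(PERIODS)]

# A priori combinations listed in the text.
prior_means = [0.0, MU, MU / 2, MU / 3, MU / 4, MU / 5]
ratios = [1.0, 1 / 2, 1 / 3, 1 / 4, 1 / 5, 2.0, 3.0, 4.0, 5.0]  # var0 / var

for mu0 in prior_means:
    for r in ratios:
        after_5 = posterior_mean(demands[:5], mu0, r * VAR)
        after_100 = posterior_mean(demands, mu0, r * VAR)
        print(f"mu0={mu0:6.1f}  var0/var={r:5.2f}  "
              f"t=5: {after_5:6.1f}  t=100: {after_100:6.1f}")
```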

Having thus tested the Bayes technique for internal stability,
simulations were further used to compare the technique with exponential
smoothing. For this comparison, mean demand was estimated by
smoothing techniques using the formula

$$S_t(x) = \alpha \sum_{k=0}^{t} \beta^k x_{t-k},$$

allowing for initial conditions $S_0(x)$ other than zero. Once again
parameter choices $\mu = 100$ and $\sigma = 10$ were adopted. As a first
comparison, the least favorable initial conditions, $\mu_0 = 0$ and
$S_0(x) = 0$, were selected. Estimates of mean demand over various
periods were then made for a variety of choices of the weighting
factor $\sigma_0^2$ and the smoothing constant $\alpha$. The results are reported
in Table 4.2 where it may be seen that Bayes estimates are typically
better than those given by exponential smoothing when roughly the
same relative weight is attached to the observations. Thus, small
values of $\sigma_0^2$ should be compared with small values of $\alpha$. If we take
$\alpha = 0.2$ as presently used by NavSup as a guide, then almost any choice
of $\sigma_0^2$ will do better in the early stages and about as well in later
periods.

$\mu = 100$, $\sigma = 10$, $\mu_0 = 0$, $S_0(x) = 0$

                                           Number of Periods
Estimator    Parameter Values              5        10       15       50       100
BAYES        $\sigma_0^2 = \sigma^2$       83.7     91.0     93.7     97.9     99.1
             $2\sigma^2$                   91.3     95.4     96.8     98.8     99.5
             $3\sigma^2$                   94.1     96.9     97.8     99.2     99.7
             $4\sigma^2$                   95.6     97.7     98.3     99.3     99.8
             $5\sigma^2$                   96.6     98.2     98.7     99.4     99.8
SMOOTHING    $\alpha = 0.1$                41.1     65.1     79.3     99.2     100.4
             0.2                           67.3     89.1     96.4     99.8     100.6
             0.3                           83.2     96.9     99.5     99.9     100.8
             0.4                           92.1     99.0     100.0    99.9     100.9
             0.5                           96.5     99.5     100.3    99.9     100.9

Table 4.2. Estimates of $\mu = 100$
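
The pattern in Table 4.2 can be anticipated from the expected values of
the two estimators. The sketch below is an idealized calculation, not the
simulation reported above. It assumes independent demands with mean $\mu$,
and it assumes that smoothing with a starting value $S_0(x)$ follows the
usual recursion $S_t(x) = \alpha x_t + \beta S_{t-1}(x)$ with $\beta = 1 - \alpha$;
the function names are hypothetical. For instance, the expected Bayes
estimate after 5 periods with $\sigma_0^2 = \sigma^2$ is $500/6$, about 83.3,
against the simulated 83.7, and the expected smoothed value with
$\alpha = 0.1$ is about 41.0 against the simulated 41.1.

```python
def expected_bayes(t, mu, mu0, ratio):
    """Expected posterior mean (4.2) after t periods.

    ratio is the prior variance divided by the demand variance.
    """
    return (t * ratio * mu + mu0) / (t * ratio + 1.0)


def expected_smoothing(t, mu, s0, alpha):
    """Expected S_t under the assumed recursion S_t = alpha*x_t + beta*S_{t-1}."""
    beta = 1.0 - alpha
    return mu * (1.0 - beta ** t) + s0 * beta ** t


MU, MU0, S0 = 100.0, 0.0, 0.0  # least favorable initial conditions
periods = (5, 10, 15, 50, 100)

for ratio in (1, 2, 3, 4, 5):
    row = "  ".join(f"{expected_bayes(t, MU, MU0, ratio):6.1f}" for t in periods)
    print(f"Bayes      var0 = {ratio} * var   {row}")
for alpha in (0.1, 0.2, 0.3, 0.4, 0.5):
    row = "  ".join(f"{expected_smoothing(t, MU, S0, alpha):6.1f}" for t in periods)
    print(f"Smoothing  alpha = {alpha}       {row}")
```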
To assess the effect of initial conditions on risk, the same
basic model was used to generate demands for 100 periods. Reorder
levels were then set on the basis of a k-value to achieve a theoretical
risk of $p = .05$ using both techniques, and the actual attained risk
was then recorded. The experiment was then replicated 1,000 times
and the attained risks averaged over these replications. The results
are reported in Table 4.3 for the worst initial conditions $\mu_0 = 0 = S_0(x)$
and in Table 4.4 for the best initial conditions $\mu_0 = 100 = S_0(x)$.

The  results  are  quite  remarkable.   Except  for  a  few  cases  the 
Bayes  method  provides  a  sample  risk  closer  to  the  theoretical  one  than 
does  exponential  smoothing  even  for  poor  initial  conditions.   In  both 
cases, when a small value of $\alpha$ is chosen the long term results are
fairly accurate, but the results in the early periods are far from
satisfactory. For large values of $\alpha$ the results in the early periods
are better but only at the expense of weaker results in later periods.
By comparison, the Bayes technique produces about the same results
in any case. For large weighting constants ($\sigma_0^2 = 5\sigma^2$), the Bayes
method  adjusts  quite  rapidly  and  the  long  term  results  are  all  fairly 
accurate. 

Of  course,  all  of  these  results  are  average  values,  averaged 
over  the  replications.   How  badly  they  vary  from  one  replication  to 
another  is  important  also.   To  check  on  variability,  the  sample 
standard deviations of the estimates of $\mu$ for the 1,000 replications
were  computed.   Those  values  are  reported  in  Table  4.5  for  the  case 

$\mu = 100$, $\sigma = 10$, $\mu_0 = 0$, $S_0(x) = 0$

                                           Number of Periods
Estimator    Parameter Values              5        10       15       50       100
BAYES        $\sigma_0^2 = \sigma^2$       .634     .294     .187     .079     .071
             $2\sigma^2$                   .310     .159     .107     .068     .062
             $3\sigma^2$                   .211     .128     .090     .066     .061
             $4\sigma^2$                   .168     .109     .084     .065     .061
             $5\sigma^2$                   .146     .097     .080     .065     .060
SMOOTHING    $\alpha = 0.1$                1.000    .992     .737     .067     .063
             0.2                           .995     .390     .145     .061     .069
             0.3                           .761     .137     .080     .070     .079
             0.4                           .365     .088     .080     .076     .083
             0.5                           .187     .084     .083     .084     .091

Table 4.3. Sample Risks with Worst Initial Conditions


$\mu = 100$, $\sigma = 10$, $\mu_0 = 100$, $S_0(x) = 100$

                                           Number of Periods
Estimator    Parameter Values              5        10       15       50       100
BAYES        $\sigma_0^2 = \sigma^2$       .066     .056     .057     .057     .055
             $2\sigma^2$                   .069     .057     .058     .057     .055
             $3\sigma^2$                   .072     .058     .058     .057     .055
             $4\sigma^2$                   .073     .059     .058     .057     .055
             $5\sigma^2$                   .074     .059     .058     .057     .055
SMOOTHING    $\alpha = 0.1$                .060     .051     .057     .057     .063
             0.2                           .059     .061     .061     .061     .069
             0.3                           .066     .073     .069     .070     .079
             0.4                           .073     .077     .079     .076     .083
             0.5                           .082     .082     .083     .084     .091

Table 4.4. Sample Risks with Best Initial Conditions


$\mu = 100$, $\sigma = 10$, $\mu_0 = 100$, $S_0(x) = 100$

                                           Number of Periods
Estimator    Parameter Values              5        10       15       50       100
BAYES        $\sigma_0^2 = \sigma^2$       3.95     3.26     2.84     1.95     1.69
             $2\sigma^2$                   4.25     3.40     2.94     1.97     1.72
             $3\sigma^2$                   4.36     3.41     2.94     1.96     1.73
             $4\sigma^2$                   4.46     3.47     2.95     1.95     1.72
             $5\sigma^2$                   4.48     3.48     2.97     1.95     1.72
SMOOTHING    $\alpha = 0.1$                2.31     2.64     2.68     2.61     2.73
             0.2                           3.41     3.68     3.63     3.60     3.65
             0.3                           4.30     4.51     4.40     4.49     4.44
             0.4                           5.10     5.27     5.17     5.30     5.16
             0.5                           5.96     6.03     5.90     6.06     5.88

Table 4.5. Sample Standard Deviations

$\mu_0 = 100 = S_0(x)$. It may be seen that for early periods and small
choices of $\alpha$, the smoothing method is less variable. But, as more
and more periods are taken, the Bayes method produces less variable
results; indeed, the standard deviation consistently decreases with
time. On the other hand, smoothing yields results that appear to
have about the same variance regardless of how many periods are
observed, a phenomenon that has been noted before. The sample standard
deviations for the case $\mu_0 = 0 = S_0(x)$ were surprisingly about
the same as the most favorable case and are not presented here.
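
The qualitative difference in variability has a simple theoretical
counterpart. The sketch below is an idealized calculation and not a
reproduction of Table 4.5, whose simulated entries need not coincide with
these values. It assumes independent demands with known standard deviation
$\sigma$, fixed starting values, and the recursion
$S_t(x) = \alpha x_t + \beta S_{t-1}(x)$ with $\beta = 1 - \alpha$ for smoothing;
the function names are hypothetical.

```python
import math


def bayes_sd(t, sigma, ratio):
    """Standard deviation of the posterior mean (4.2) after t periods.

    ratio is the prior variance divided by the demand variance.
    """
    return sigma * ratio * math.sqrt(t) / (t * ratio + 1.0)


def smoothing_sd(t, sigma, alpha):
    """Standard deviation of S_t under the assumed smoothing recursion."""
    beta = 1.0 - alpha
    return sigma * math.sqrt(alpha * (1.0 - beta ** (2 * t)) / (2.0 - alpha))


for t in (5, 10, 15, 50, 100):
    print(f"t={t:3d}  Bayes (var0 = var): {bayes_sd(t, 10.0, 1.0):5.2f}  "
          f"smoothing (alpha = 0.2): {smoothing_sd(t, 10.0, 0.2):5.2f}")
```

Under these assumptions the smoothing standard deviation levels off near
$\sigma\sqrt{\alpha/(2-\alpha)}$, while the Bayes standard deviation falls off
roughly like $\sigma/\sqrt{t}$.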

5.   CONCLUDING  REMARKS 

Regardless  of  what  else  might  be  said  about  exponential  smooth- 
ing  as  a  forecasting  tool,  it  now  seems  reasonably  safe  to  say  that 
the  results  tend  to  be  more  variable  than  some  other  alternative 
methods  that  are  available.   This  same  basic  theme  keeps  recurring  in 
model  after  model  and  case  after  case.   Claims  in  this  regard  have 
repeatedly  been  made  with  due  caution  throughout  this  and  earlier  studies 
due  to  the  simulation  techniques  employed.   Yet  the  consistency  of 
recurrence,  coupled  with  the  large  sample  sizes  used,  cannot  be  safely 
ignored.   In  some  isolated  cases,  we  have  supplied  a  theoretical  basis 
for  the  observations. 

It is practically never the case that exponential smoothing
dominates the alternatives studied regardless of the criterion used
for comparison. One of the outgrowths of this study is to highlight
the importance of variance whenever random demand is faced. It is
a quantity that must be reckoned with, for it is of little comfort to
the individual inventory manager to know that his technique does well
on the average unless some idea of the variability is also known.

Of the alternatives studied, the Bayes method of Section 4
seems admirably suited to a model where mean demand is constant in a
given period but subject to change from period to period. The method
supplies a natural and appealing way of incorporating prior information
and of updating estimates sequentially as information is
gathered. More needs to be done with the method, however, before it
can be endorsed over other alternatives. This would be the basic
recommendation of this study, which should be viewed only as an
initial pilot study of this technique. Another recommendation would
be to urge all users of exponential smoothing to give serious consideration
to testing other alternatives in the particular context of
their special application. Special attention should be paid to at
least replacing MAD as a method of estimating variance. This much
change alone may produce less variable results and thereby make a
stronger case for exponential smoothing.